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Abstract 

We present a framework, called AIR, for verifying safety properties of assembly language pro- 
grams via software model checking. AIR extends the applicability of predicate abstraction and 
counterexample guided abstraction refinement to the automated verification of low-level software. 
By working at the assembly level, AIR allows verification of programs for which source code is 
unavailable-such as legacy and COTS software-and programs that use features-such as pointers, 
structures, and object-orientation-that are problematic for source-level software verification tools. 
In addition, AIR makes no assumptions about the underlying compiler technology. We have imple- 
mented a prototype of AIR and present encouraging results on several non-trivial examples. 


1 Introduction 

Over the past decade, there has been considerable advancement in the theory and practice of automated 
formal software verification. One of the most promising paradigms to emerge in this area is software 
model checking (SMC) [3] - a combination of counterexample guided abstraction refinement [111 with 
predicate abstraction [18]. SMC verifies that a program P satisfies a specification 4> iteratively, as follows: 

1 . (Abstraction) Construct a conservative model M from P via predicate abstraction. Go to Step 2. 

2. (Verification) Model check M f= 0. If this is the case, then terminate with result P f= (j). Otherwise 
let CE be a counterexample to M \= <j> returned by the model checker. Go to Step 3. 

3. (Validation) Check whether CE corresponds to some concrete behavior of P. If this is the case, 
then we obtain a real counterexample and terminate with P <f>. Otherwise, CE is a spurious 
counterexample. Go to Step 4. 

4. (Refinement) Construct a more precise model M' that does not admit CE as a behavior and repeat 
from Step 2 with M — M'. 

Variations of the above process have been investigated by several research groups [3, 19, 9] with con- 
siderable success on source code derived from real-life examples. However, there has been considerably 
less work on applying SMC to verify machine-level programs. In this paper, we show that in spite of 
the absence of high-level information, such as variable names and branch conditions, the effectiveness 
of SMC extends to even low-level software. More specifically, we present a SMC-based procedure for 
verifying safety properties of PowerPC™ assembly programs. Our approach, which we call Assembly 
Iterative Refinement or AIR, consists of two broad phases: 

1 . Decompilation: In this stage, we translate the target assembly program A and safety property </>a 
into an equivalent C program P and safety property (j)p. In essence, we treat each register as a 
global variable. Each procedure in A leads to a corresponding procedure in P, and each assembly 
instruction produces one or more C statements. We present further details of the decompilation 
process in Section 3. Note that C is particularly suited as the target language of decompilation 
because: (i) C supports bit-level operations that are critical for preserving the semantics of assem- 
bly instructions during decompilation; (ii) we are able to build on existing infrastructures for C 
verification to perform the next stage of AIR. 

E. Denney, D. Giannakopoulou, C.S. Pasareanu (eds.); The First NASA Formal Methods Symposium, pp. 36-45 
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2. Verification: In this stage, we use SMC to check P |= < pp . Since decompilation is semantics- 
preserving, the result obtained in this stage for P also applies to the original assembly program 
A. Furthermore, any counterexample obtained with respect to P is transformed to a corresponding 
counterexample with respect to A. Further details about verification can be found in Section 4. 

AIR extends the applicability of SMC to assembly program verification, and yields several tangible 
benefits. First, AIR does not require source code, and thus is applicable to software - such as legacy, 
proprietary, and commercial-off-the-shelf (COTS) software - for which source code is often not avail- 
able. Second, unlike source code analysis, AIR analyzes exactly what is to be executed and makes no 
assumptions about any compiler technology being used. Thus, it eliminates the need to ensure compiler 
correctness, and reduces the size of the trusted computing base. This is especially desirable when analyz- 
ing safety-critical systems. Third, AIR is not tied to any specific high-level programming language, and 
consequently is more versatile than source-code verification. In particular, AIR is able to sidestep fea- 
tures, such as pointers, structures, and object-orientation, which are problematic for source-level analysis 
tools. Finally, since AIR decompilation is semantics preserving and targets the C language, it enables us 
to leverage existing and emerging C analysis tools for verification. Thus, even though we experiment 
with SMC-based tools, other C verifiers (e.g., CBMC [7] and F-Soft [21]) are also applicable. 

We have implemented AIR and obtained encouraging results on several non-trivial benchmarks de- 
rived from Linux device drivers and an embedded OS. Further details can be found in Section 5. The rest 
of this paper is organized as follows. In Section 2, we survey related work. The decompilation and ver- 
ification stages of AIR are described in Sections 3 and 4, respectively. Finally, we present experimental 
results in Section 5 and conclude in Section 6. 

2 Related Work 

Following SLAM [3], several other projects - e.g., MAGIC [9] and BLAST [19] - have investigated the 
use of the SMC paradigm for C source code verification. A number of projects - such as SPIN [20], 
Java PathFinder [32], BANDERA [17], BOGOR [16], Behave! [4], and ZING [1] - have also looked at 
software verification, but not necessarily via SMC. Instead, their focus has been on other languages, such 
as Java, and other program features, such as concurrency. Decompilers have been traditionally developed 
for binary understanding and reverse engineering, and not for verification per se. Nevertheless, the use 
of decompilation for verification has been suggested by Breuer and Bowen [6], and by Curzon [14] for 
verifying micro-code. 

The verification of low-level software [12, 27] has also received a lot of attention. A number of ap- 
proaches are based on either theorem proving, type checking, or static analysis. For example, Boyer and 
Yu have verified object code for the MC68020 processor using the Nqthm theorem prover [5]. Yu [33] 
has proposed the use of certified assembly programming and type preserving translations for ensuring 
the safety of low-level code. His techniques are powerful but require considerable manual intervention, 
e.g., via type annotations and the use of proof assistants. Yu and Shao [34] have also proposed a logic 
based type system for the static verification of concurrent assembly programs. 

Reps et al. [30] have used static analysis algorithms to recover information about the contents of 
memory locations and how they are manipulated by executables. They have also created CodeSurfer/x86, 
a prototype tool for browsing, inspecting and analyzing x86 binaries. Our technique is based on model 
checking, is completely automated, and targets PowerPC™ assembly code. Balakrishnan et al. [2] 
have used model checking to analyze stripped device driver executables. Their approach is not based on 
decompilation to C, but on a tight combination of their own model checker and control-flow-graph-based 
internal representation for the target executables. 


37 



Software Model Checking without Source Code 


Chaki and Ivers 


Another approach for ensuring the correctness of low-level programs is source code verification 
combined with compiler validation, i.e., proving that the compiler always produces a target code which 
correctly implements the source code. In practice, proving compiler correctness is extremely tedious. 
Furthermore, any change in the compiler necessitates its revalidation. Our technique is impervious to the 
underlying compiler technology. A complementary technique is translation validation [28] where instead 
of validating the compiler, each individual run of the compiler is followed by a validation phase which 
verifies that the target code produced on that run correctly implements the submitted source program. 
Both compiler validation and translation validation assume that the source code is available and has been 
independently verified. Our approach does not require such an assumption. 

3 Decompilation 

The first stage of AIR is the decompilation of the target assembly program A into an equivalent C program 
P. In this section we describe the decompilation procedure using a small assembly program as a running 
example. For ease of understanding, we start with the source code from which A was compiled. Fig. 1 
shows a small C code fragment Pq on the left and the result of compiling it with gcc on the right. Pq is 
derived from MICRO-C, a lightweight operating system for real-time embedded applications. For brevity 
and simplicity we eliminated some irrelevant code from the actual MICRO-C sources. Also, the assembly 
A on the right of Fig. 1 does not contain some book-keeping information generated by gcc. 

We use the 32-bit variant of the PowerPC™ instruction set architecture (ISA) and assume little- 
endian mode. The PowerPC™ architecture defines 32 32-bit general purpose registers (GPRs) - re- 
ferred to in A as °/ 0 rO through 7«r31. In addition, there are 32 64-bit floating point registers (FPRs) - 
referred to as %f 0 through 7„f 31 - and a few special registers (SPRs), e.g., condition register (7,cr), 
link register (7,lr), etc. An assembly program consists of a set of blocks, each beginning with a 
label. A label is either a procedure name (OSMemNameSet) or begins with a dot (.LI ... .L4). 

The procedures OS_ENTER_CRITICAL and OS_EXIT_CRITICAL acquire and release a global lock for 
achieving mutual exclusion. We wish to verify whether our program satisfies the following property: 
(Safety) OS_ENTER_CRITICAL and OS_EXIT_CRITICAL are invoked alternately, beginning with a call to 
OS_ENTER_CRITICAL. Note that Safety is representative of a general class of safety specifications with 
respect to the acquisition and release of resources. Also, our example program does not satisfy Safety. 
Indeed, if the conditions of the first two if statements are both satisfied, then OS_EXIT_CRITICAL gets 
called twice in a row without any intervening call to OS_ENTER_CRITICAL. One possible fix for this 
problem is to add a return statement as indicated by the comment in the C code. 

3.1 From assembly to C 

The decompilation process converts the assembly program A to a C program, using the following general 
strategy: 

• The 32 GPRs are declared as global int variables rO through r31. The 32 FPRs are declared as 
global double variables fO through f31. The SPRs are also declared as int variables cr, lr 
and so on. All integer data is assumed to be in signed (two’s complement) 32-bit format and all 
double data is assumed to be in IEEE 64-bit double precision format. 

• Each label corresponding to a procedure name yields a procedure declaration. Since an assem- 
bly program passes and returns all values via registers (i.e., global variables), our procedures are 
void-void, i.e., they have no parameters or return values. In our example, we obtain a single 
procedure declaration: 

void OSMemNameSet (void) 
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struct os_mem { 
void *OSMemAddr ; 
void *OSMemFreeList ; 
unsigned int OSMemBlkSize ; 
unsigned int OSMemNBlks ; 
unsigned int OSMemNFree ; 
char OSMemName [32] ; 

>; 


typedef struct os_mem 0S_MEM; 

void OSMemNameSet (0S_MEM *pmem, 

char *pname , unsigned char *err ) 

unsigned char len; 

OS_ENTER_CRITICAL ( ) ; 

if ((unsigned int )pmem == (unsigned int )((0S_MEM *)0)) { 
OS_EXIT_CRITICAL() ; 

(*err) = 116; 

//bug : there should most likely be a return here 
//return; 

> 

if ((unsigned int )pname == (unsigned int )((char *)0)) { 
OS_EXIT_CRITICAL() ; 

(*err) = 15; 
return; 

> 

if ((int )len > 31) { 

OS_EXIT_CRITICAL() ; 

(*err) = 119; 
return; 

> 

OS_EXIT_CRITICAL ( ) ; 

(*err) = 0; 
return; 


OSMemNameSet : 

stwu y 0 rl,-48(°/,rl) 
mflr °/«r0 
stw # /,r31,44(°/„rl) 
stw # / 0 rO,52( 0 / 0 rl) 
mr °/ 0 r31 , °/«r 1 
stw # / 0 r3,8(°/«r31) 
stw °/ 0 r4, 12(°/ 0 r31) 
stw °/ 0 r5, 16( , / 0 r31) 
bl OS_ENTER_CRITICAL 
lwz # / 0 rO,8( 0 /.r31) 
cmpwi °/ 0 cr7 , °/ 0 rO , 0 
bne # /«cr7 , . L2 
bl OS_EXIT_CRITICAL 
lwz °/ 0 r9, 16(°/ 0 r31) 
li °/ 0 rO,116 
stb # / 0 rO,0( 0 / 0 r9) 

. L2 : 

lwz # / 0 rO,12(°/,r31) 
cmpwi °/ 0 cr7 , °/ 0 rO , 0 
bne # /«cr7 , . L3 
bl OS_EXIT_CRITICAL 
lwz 0 /„r9,16( , /,r31) 
li °/ 0 rO,15 
stb # / 0 rO,0( 0 / 0 r9) 
b .LI 
.L3 : 

lbz # /.r0,20(°/„r31) 
rlwinm 0 / 0 rO,°/,rO,0,Oxff 
cmplwi 0 / 0 cr7, 0 / 0 rO,31 
ble # /«cr7,.L4 
bl OS_EXIT_CRITICAL 
lwz °/ 0 r9, 16(°/ 0 r31) 
li °/ 0 rO,119 
stb # / 0 rO,0( 0 / 0 r9) 
b .LI 
. L4 : 

bl OS_EXIT_CRITICAL 
lwz °/ 0 r9,16( # /„r31) 
li °/.r0,0 
stb # / 0 rO,0( 0 /«r9) 

.LI: 

lwz # /,rll,0(°/,rl) 
lwz # / 0 rO,4( 0 / 0 rll) 
mtlr %r0 

lwz # /.r31,-4(°/,rll) 
mr °/.rl,°/„rll 
blr 


Figure 1 : A running example. 


• Each label beginning with a dot results in a corresponding label in the C program. We strip off 
the initial dot to conform to valid ANSI-C syntax. Thus, the C program generated in our example 
contains four labels LI through L4. 

• Each assembly instruction gets translated to an equivalent sequence of C statements. In the rest 
of this section, we describe the translation process for the instructions that appear in our example. 
Note that the size of the resulting C program is linear in the size of the input assembly program. 

3.2 Translating assembly instructions 

PowerPC™ follows the Reduced Instruction Set Computer (RISC) or the load-store paradigm. Thus, 
there are no arithmetic, logical, or control-flow instructions that operate directly on data stored in mem- 
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ory. All operations are performed on GPRs, FPRs, and SPRs. In order to operate on memory data, the 
operands are loaded explicitly into registers, and the result is stored explicitly back to memory. 


Assembly 

C statements 

Loads and stores 

lwz •/.r0,8("/,r31) 

rO = *((int*) (r31 + 8) ) ; 

li %r0,116 

rO = 116; 

lbz , /,r0,20('/,r31) 

rO = (*((int*) (r31 + 20))) k (Oxff ) ; 

stwu '/,rl,-48('/,rl) 

*((int*)(rl - 48)) = rl; rl = rl - 48; 

stw •/.r31,44(*/.rl) 

*((int*)(rl + 44)) = r31; 

stb 0 / 0 rO,0( 0 / 0 r9) 

*((int*) (r9 + 0)) = 


((*((int*)(r9 + 0))) k OxffffffOO) I (rO k Oxff); 

Register operations 

mr '/,r31,'/,rl 

r31 = rl; 

mflr ’/.rO 

rO = lr; 

mtlr %r0 

lr = rO; 

rlwinm °/ 0 rO , °/«r0 , 0 , 255 

rO = (((rO » 32) k 0) I ((rO « 0) k Oxffffffff)) 


k (Oxff); 

cmpwi °/ 0 cr7 , °/ 0 rO , 0 

cr = (rO < 0) ? (cr I 0x8) : (cr k Oxffff ff f7) ; 


cr = (rO > 0) ? (cr I 0x4) : (cr k Oxfffffffb) ; 


cr = (rO == 0) ? (cr I 0x2) : (cr k Oxfffffffd) ; 

cmplwi 0 / 0 cr7, 0 / 0 rO,31 

cr = (rO >= 0) kk (rO < 31) ? 


(cr | 0x8) : (cr k 0xfffffff7) ; 


cr = (rO < 0) || (rO > 31) ? 


(cr | 0x4) : (cr k Oxfffffffb); 


cr = (rO >= 0) kk (rO == 31) ? 


(cr | 0x2) : (cr k Oxfffffffd) ; 

Conditional and unconditional jumps 

b .LI 

goto LI; 

ble °/ 0 cr7 , . L4 

if(!(cr & 0x4)) goto L4; 

bne °/ 0 cr7 , . L2 

if(!(cr & 0x2)) goto L2; 

bl OS_ENTER_CRITICAL 

0S_EWTER_CRITICAL() ; 

blr 

return; 


Figure 2: Translation schema from assembly instructions to C statements. 

Fig. 2 shows a table with assembly instructions on the left and the corresponding C statements on 
the right. Among the instructions in Fig. 2, the following have straightforward translations: li = load 
immediate, mr = move register, mflr = move from link register, mtlr = move to link 
register, b = branch, bl = branch link, and blr = branch link return. The translations 
for the other instructions can be understood on the basis of their semantics, as described below: 

• lwz = load word and zero: loads a word from the memory location denoted by the second 
argument to the register denoted by the first argument. 

• lbz = load byte and zero: loads a byte from the source memory location S denoted by the 
second argument to the target register T denoted by the first argument. The higher-order 24 bits of 
T are set to zero. Due to little-endianness, if we load an integer from S, our desired byte will be 
laid out at the lower order end. Now we have to zero out the higher-order 24 bits. 

• stwu = store word with update: stores the word in source register S denoted by the first 
argument to the target memory location T denoted by the second argument, and then sets the value 
of S to T. 

• stw = store word: stores the word in source register S denoted by the first argument to the 
target memory location T denoted by the second argument. 

• stb = store byte: stores the lowest byte in source register S denoted by the first argument to 
the target memory location T denoted by the second argument. Again due to little-endianness, we 
load the current word stored at T, replace its lowest byte with the lowest byte of S, and store the 
new value back to T. 
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• rlwinm = rotate left word immediate then AND with mask: rotates left the contents of 
the source register S (denoted by the second argument) by the number of bits B (denoted by the 
third argument), then logically ANDs the rotated data with a 32-bit mask BM (denoted by the 
fourth argument), and stores the result in the target register T (denoted by the first argument). 

To understand the comparison and jump instructions, we note that the condition register cr is logi- 
cally partitioned into eight sub-registers crO . . . cr7. The sub-registers are numbered from the higher 
order bits to the lower order bits of cr as shown by the following diagram. 


| crO | crl | cr2 | cr3 | cr4 | cr5 | cr6 | cr7 | 


Thus, cr7 denotes the lowest four bits of cr. Further, suppose that the results of a comparison 
between X and Y are stored in a condition sub-register R. Then the bits of R must be interpreted as 
follows. The highest bit is 1 if and only if X < Y. the next bit is 1 if and only if X > Y, and the next bit is 
1 if and only if X — Y. The lowest bit is reserved for overflows. We now present the translation scheme 
for the remaining instructions. 

• cmpwi = compare word immediate: compares the contents of the register denoted by the sec- 
ond argument with the integer denoted by the third argument treating both values as signed inte- 
gers, and stores the result in the condition sub-register denoted by the first argument. 

• cmplwi = compare logical word immediate: compares the contents of the register denoted 
by the second argument with the integer denoted by the third argument treating both values as 
unsigned integers, and stores the result in the condition sub-register denoted by the first argument. 
Since all our C variables are signed, we guard the C statement to be executed on conditions that 
check for negative values in addition to the actual comparison being performed. 

• ble = branch less equal : jumps to the label denoted by the second argument if the condition 
sub-register denoted by the first argument indicates a "less or equal” comparison result. 

• bne = branch not equal : jumps to the label denoted by the second argument if the condition 
sub-register denoted by the first argument indicates a "not equal” comparison result. 

This completes the description of the translation scheme for the instructions appearing in our exam- 
ple. In total, to perform our experiments, translations were written for eighty nine PowerPC assembly 
instructions. Complete details of the PowerPC™ ISA are available [29] online. 

4 Verification 

Once the target assembly program has been decompiled to a C program P, the second stage of AIR 
involves the verification of P via SMC. For our experiments, we used the COPPER [13] SMC tool for 
verification. For the purpose of AIR we only required the ability of Copper to check for trace contain- 
ment between sequential C programs and finite state machines. In addition, though our familiarity with 
Copper lead to its use in our experiments, any other C verification tool based on the SMC paradigm, 
such as SLAM [3], MAGIC [9], or BLAST [19], is suitable for use in the verification stage of AIR. 

Indeed, the main challenge involved in the use of SMC for AIR verification is tool-independent, and 
arises from the need for precise handling of bit-level semantics during SMC 1 . In all cases, the handling of 
bit-level semantics is delegated to the theorem prover used during predicate abstraction. Most often, the 

1 We note that the C bounded model checker CBMC [10) does obey precise bit-level semantics but does not use SMC. 
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theorem prover (usually Simplify [26] or Vampyre [31]) treats the C bit-wise operators as uninterpreted 
functions. For source code verification, this is not a major roadblock since many properties verified 
on source code do not rely on the precise interpretation of bitwise operators. However, in the case of 
AIR, precise interpretation of bitwise operations is crucial for verifying the C programs generated via 
decompilation. In our initial experiments, not a single non-trivial property could be verified by leaving 
the bitwise operators uninterpreted. We attempted several solutions to this problem, as discussed next. 


Solution 1: Adding axioms. First, we added extra axioms about C bitwise operators to assist Simplify, 
the default theorem prover used by Copper. Unfortunately, this solution is ad hoc, since we had no way 
of knowing if we had added enough axioms. Also, the performance of Simplify, in terms of both time 
and memory consumption, degraded dramatically with increasing numbers of axioms. Ultimately, we 
concluded that this approach would not scale to realistic programs. 


Solution 2: Syntactic analysis. Next, we augmented COPPER with a set of syntactic bit-level analyses. 
Specifically, before invoking Simplify, Copper performs some simplifications on the formula whose 
validity has to be checked. The transformations are targeted at specific patterns that arise in formulas due 
to the structure of assembly programs. For example, a common query to Simplify is the validity of ( (E 
I 0x4) » 2) & (0x1), where E is some C expression. Our technique is able to convert such formulas 
to 0x1, whose validity can then be easily decided by Simplify. We call this solution uninterpreted since 
all bitwise operators are left completely uninterpreted by the theorem prover. 


Solution 3: Using a bit-vector decision procedure. We also compared the above approach to the idea 
of replacing Simplify with the bit-vector decision procedure CProver [22] (we also experimented with 
CVC Lite [15] but found CProver to be faster). We tried two variations of this idea. In the interpreted 
variation, all formulas containing bitwise operators are solved using CPROVER. In the semi-interpreted 
variation, formulas containing bitwise operators are first solved using Simplify. CPROVER is used only 
if Simplify is unable to decide validity conclusively. 

In our example, AIR is able to successfully report the bug in MICRO-C. When the bug is fixed, in 
accordance with the suggestion in the comment, AIR successfully verifies the safety property. Also our 
experiments indicate that the uninterpreted approach yields the best performance over a set of realistic 
benchmarks. We now present full details of the empirical validation of our technique. 


5 Experimental Validation 

We experimented with a set of benchmarks derived from MICRO-C and Linux device drivers. All our 
experiments were performed on a single core 2.4 GHz Pentium computer running RedHat 9. We imposed 
a time limit of one hour, and memory limit of one GB. We derived a set of eleven benchmarks - one from 
MICRO-C, and the rest from Linux 2.6. 11.10 kernel drivers - by compiling C source code with gcc-3 . 2. 
For each example, we checked that a certain “lock” was being acquired and released properly. The nature 
of the lock varied with the example. For MICRO-C, the lock was an invocation of OS_ENTER_CRITICAL, 
while for the Linux drivers it was a call to spin_lock, spin_lock_irq or spin_lock_irqsave. The 
"unlock” was derived accordingly. 

We initially observed that COPPER is easily able to verify the safety property for all our benchmarks 
because the locks and unlocks are paired up syntactically. In other words, an analysis of the control 
flow graph suffices and no further predicate abstraction is necessary. To make our benchmarks more 
interesting, we added data dependencies between the locks and unlocks. Essentially we guarded the 
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T 
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I 

CE 

T 
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10 
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- 

* 
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- 
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70 

31 

- 

* 

67 

- 
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33 

2319 

121 

16 

* 

107 

- 

198 

74 

16 

- 

* 

141 

- 

DAC960 

45 

* 

193 

- 

* 

142 

- 

520 

107 

16 

X 

61 

128 

6278 

devices 

17 

1018 

41 

17 

3523 

45 

18 

350 

40 

19 

X 

202 

60 

96701 

ide 

26 

510 

61 

15 

557 

61 

15 

25 

34 

15 

- 

* 

105 

- 

ipr 

35 

* 

285 

- 

* 

288 

- 

310 

79 

19 

X 

30 

100 

1230 

message 

21 

194 

28 

26 

145 

28 

25 

29 

27 

26 

X 

16 

63 

831 

mxser 

22 

699 

53 

19 

547 

52 

19 

129 

46 

19 

X 

6 

51 

1302 

synclink 

34 

123 

33 

15 

106 

33 

15 

15 

30 

15 

X 

40 

89 

4292 

tg3 

61 

988 

77 

18 

907 

77 

18 

168 

70 

21 

X 

84 

152 

11496 

tlan 

31 

312 

63 

7 

242 

63 

7 

73 

49 

7 

X 

25 

88 

2062 


Figure 3: Results for non-buggy benchmarks. KLOC = 1000 lines of assembly; T = time in seconds; 
M = memory in MB; I = # of iterations; * means that the resource was exhausted; - means that no 
measurement was available; x means that the counterexample returned by BLAST is spurious. Best 
figures are highlighted. 
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I 
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10 

* 
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- 

* 
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- 

* 

70 

- 

- 

* 

67 

- 
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33 
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53 

10 
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53 

10 

84 

49 

18 

X 

40 

98 
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45 
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13 
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13 
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13 

X 

70 
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17 
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14 

* 

46 
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* 

43 

- 

X 
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59 
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26 

* 

75 

- 
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67 

- 

* 

39 

- 

- 

* 
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- 
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35 
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280 

6 
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6 
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76 

6 
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37 

99 
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21 

62 

26 

11 

76 

26 

11 

6 

24 

11 

X 

16 

63 
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Figure 4: Results for buggy benchmarks. KLOC = KLOC of assembly; T = time in seconds; M = memory 
in MB; I = # of iterations; * means that the resource was exhausted; - means that no measurement was 
available; x means that the counterexample returned by BLAST is spurious. Best figures are highlighted. 


locks and unlocks with a non-deterministic value. Since the same value guards both lock and unlock the 
examples are still correct. 

We experimented using the interpreted, semi-interpreted and uninterpreted approaches presented in 
Section 4. In the first two cases, the syntactic simplifications were also applied. As a control, we also 
used BLAST version 1 .0. The results of our experiments with these benchmarks are summarized in Fig. 3. 
Next, for each benchmark, we created a buggy version by artificially inserting errors and repeated our 
experiments. The results for our experiments with the the buggy examples are summarized in Fig. 4. 

We observe that the uninterpreted approach exhibits the best overall performance. The interpreted 
approach beats the semi-interpreted approach by successfully proving more examples. This indicates 
that almost all the formulas involving bitwise operators could not be proved by Simplify and hence had 
to be further delegated to CPROVER. This is also consistent with our initial failure with only Simplify 
(without the syntactic simplifications). BLAST returns counterexamples for both the correct and buggy 
examples. Upon closer inspection, all counterexamples returned by BLAST are found to be spurious. We 
note that this is essentially due to blast’s dependency on Simplify. 
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6 Discussion and Conclusion 

We have presented AIR, a framework for verifying safety properties of assembly language programs via 
SMC. We have proposed a number of approaches for more precise handling of bit-level semantics during 
SMC and empirically validated their relative effectiveness. Overall, our experiments indicate that AIR is 
effective on real-life benchmarks derived from an embedded OS and Linux device drivers. 

It is worthwhile to consider a few issues concerning the AIR approach. Decompiling an assembly 
program, though much less difficult than verifying the resulting C program, requires careful attention 
to detail. Whether the target platform is big or little endian and whether a 0 or a 1 is shifted in on » 
operations on signed integers are two such intricacies. Correctly modeling elements of the program’s 
environment such as the contents of the PowerPC™ machine state register are more complicated. Other 
decisions must be guided by the capabilities of the model checker; for example, choosing whether to 
denote a comparison that treats two operands as unsigned quantities by using type casting and a simple 
comparison or by using more predicates and checking different conditions depending on the signs of the 
operands. Moreover, the correct handling of pointer aliasing during verification is crucial for maintaining 
the overall soundness of AIR. Finally, though AIR is applicable to any assembly program, it is not 
necessarily a good choice in many cases. The broad applicability of AIR comes at a cost in usability. 
Encoding properties in terms of elements of the assembly program may be more difficult then encoding 
the same property against, for example, a C program. Similarly, interpreting a counterexample expressed 
in terms of an assembly program is likely to be more difficult. However, these concerns are much less 
important when it is sufficient to know whether a property holds (or not), or when source is unavailable. 

In summary, we believe that the AIR approach has important ramifications for the development of 
effective low-level software verification techniques. Specifically, AIR is applicable to verifying other 
low-level languages such as Java bytecode and MSIL. Programs in these languages generally contain ad- 
ditional information (such as variable names) that should further increase air’s effectiveness. AIR is also 
adoptable for the purpose of using certifying model checking [23] for proof carrying code (PCC) [25]. 
Certifying model checking in combination with abstraction has been used [24, 8] to construct invariants 
and ranking functions for the purpose of certifying source code. By generating source code from binaries, 
AIR enables us to leverage the above technology for the PCC-style certification of binaries. Finally, there 
is a growing trend of implementing hardware functionality using software, such as microcode, in the do- 
main of hardware-software co-design. We believe that AIR would also be applicable for the verification 
of such low-level programs. 
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